devin-ai-integration[bot] commented Oct 9, 2025

docs: clarify v2 JSON extraction format and add real-world example

Summary

Updates the v2 JSON extraction documentation to address user confusion about format changes from v1 to v2. Users were experiencing API errors because they expected the v1 jsonOptions parameter to work in v2, but v2 requires the schema to be embedded directly inside the format object.

Key changes:

  • Added prominent Note explaining v2 API format changes at the top of the page
  • Added new "Real-world example" section with company information extraction using firecrawl.dev
  • Created 4 new code snippets (cURL, Python, Node.js, output) demonstrating the correct v2 format
  • Updated "JSON format options" section to emphasize v2 differences and clarify that jsonOptions doesn't exist
  • Enhanced parameter descriptions to be more specific about v2 requirements
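
To illustrate the format change described in the summary, here is a minimal sketch that moves a schema from a v1-style jsonOptions field into a v2-style format object. The helper name to_v2_scrape_body is hypothetical, the v1 body shape (formats: ["json"] plus jsonOptions.schema) is an assumption about how callers wrote v1 requests, and the field types in the schema are illustrative. It only builds the request body and does not call the API.

```python
import copy
import json


def to_v2_scrape_body(v1_body: dict) -> dict:
    """Return a copy of v1_body with the jsonOptions schema embedded in the json format object.

    Hypothetical helper: only the "schema" key of jsonOptions is carried over;
    any other jsonOptions keys are dropped in this sketch.
    """
    body = copy.deepcopy(v1_body)
    json_options = body.pop("jsonOptions", None) or {}

    formats = []
    for fmt in body.get("formats", []):
        if fmt == "json":
            # v2 shape from the PR notes: {type: 'json', schema: {...}}
            json_format = {"type": "json"}
            if "schema" in json_options:
                json_format["schema"] = json_options["schema"]
            formats.append(json_format)
        else:
            # Leave every other format entry untouched.
            formats.append(fmt)
    body["formats"] = formats
    return body


# Assumed v1-style body; the field names come from this PR, the types are illustrative.
v1_body = {
    "url": "https://firecrawl.dev",
    "formats": ["json"],
    "jsonOptions": {
        "schema": {
            "type": "object",
            "properties": {
                "company_mission": {"type": "string"},
                "supports_sso": {"type": "boolean"},
            },
        }
    },
}

v2_body = to_v2_scrape_body(v1_body)
print(json.dumps(v2_body, indent=2))
```

The original dictionary is deep-copied, so the v1 body can be kept around for comparison while debugging a failing request.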

Review & Testing Checklist for Human

This is a yellow-risk PR: the documentation changes need verification against live API behavior.

  • Test the cURL example manually - Copy the cURL command and verify it returns the expected JSON structure when called against the live v2 API
  • Verify Python SDK example works - Test the Python code with the actual Firecrawl Python SDK to ensure the format is correct
  • Check Node.js SDK example - Confirm the JavaScript example works with the @mendable/firecrawl-js package
  • Validate output accuracy - Check if the example output values ("Turn websites into LLM-ready data", etc.) actually match what firecrawl.dev returns
  • Confirm v2 API behavior - Verify that jsonOptions truly doesn't exist in v2 and that the schema-in-format-object structure is correct

Notes

- Add prominent note about v2 API format changes at top of page
- Add new 'Real-world example' section demonstrating complex nested schema extraction
- Update JSON format options section to emphasize v2 differences from v1
- Create comprehensive event extraction examples (cURL, Python, Node.js) using Gemeinde Grünwald calendar
- Clarify that jsonOptions parameter doesn't exist in v2 API
- Schema must be embedded directly in format object: formats: [{type: 'json', schema: {...}}]

Co-Authored-By: [email protected] <[email protected]>

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

- Replace Gemeinde Grünwald event examples with firecrawl.dev company info
- Update schema to use company_mission, supports_sso, is_open_source, is_in_yc
- Simplify examples across cURL, Python, and Node.js
- Update output to match new schema format

Co-Authored-By: [email protected] <[email protected]>
- Rename JsonSchema to CompanyInfo to avoid confusion with v1 naming
- Use .model_json_schema() for proper Pydantic model conversion
- Standardize all examples to use https://firecrawl.dev

Co-Authored-By: [email protected] <[email protected]>
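
The last commit's switch to .model_json_schema() can be sketched as follows. This is an illustration under assumptions, not the snippet from the docs: it assumes Pydantic v2 is installed, the field types (one string, three booleans) are guesses based on the field names in the earlier commit, and only the format object is built here, without any Firecrawl SDK call.

```python
from pydantic import BaseModel


class CompanyInfo(BaseModel):
    # Field names come from the commit message; the types are assumed.
    company_mission: str
    supports_sso: bool
    is_open_source: bool
    is_in_yc: bool


# Convert the model to a plain JSON Schema dict and embed it in the
# v2 format object: {type: 'json', schema: {...}}.
json_format = {
    "type": "json",
    "schema": CompanyInfo.model_json_schema(),
}

print(sorted(json_format["schema"]["properties"]))
```

Passing the dict produced by model_json_schema(), rather than the model class itself, keeps the format object JSON-serializable, which is likely why the commit made that change.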